Estimating False Discovery Proportion Under Arbitrary Covariance Dependence.

Authors

  • Jianqing Fan
  • Xu Han
  • Weijie Gu
Abstract

Multiple hypothesis testing is a fundamental problem in high dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find if any SNPs are associated with some traits and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In the current paper, we propose a novel method based on principal factor approximation, which successfully subtracts the common dependence and weakens significantly the correlation structure, to deal with an arbitrary dependence structure. We derive an approximate expression for false discovery proportion (FDP) in large scale multiple testing when a common threshold is used and provide a consistent estimate of realized FDP. This result has important applications in controlling FDR and FDP. Our estimate of realized FDP compares favorably with Efron (2007)'s approach, as demonstrated in the simulated examples. Our approach is further illustrated by some real data applications. We also propose a dependence-adjusted procedure, which is more powerful than the fixed threshold procedure.
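
As a rough illustration of the idea described in the abstract, the sketch below estimates the realized FDP at a common p-value threshold by removing a few principal factors from the test statistics. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `pfa_fdp_estimate`, its parameters, the least-squares factor fit on the 90% smallest statistics, and the choice to sum over all hypotheses (which makes the estimate conservative) are illustrative choices. It also assumes `sigma` is a known correlation matrix with unit diagonal and that the test is two-sided.

```python
import numpy as np
from scipy.stats import norm


def pfa_fdp_estimate(z, sigma, t, k):
    """Rough FDP estimate at p-value threshold t using k principal factors.

    z     : (p,) vector of z-statistics (assumed unit variance)
    sigma : (p, p) correlation matrix of z (assumed known)
    t     : common two-sided p-value threshold
    k     : number of principal factors to remove (hypothetical tuning choice)
    """
    z = np.asarray(z, dtype=float)
    p = z.shape[0]

    # Eigen-decomposition of the correlation matrix; keep the k largest factors.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    order = np.argsort(eigvals)[::-1][:k]
    lam = np.clip(eigvals[order], 0.0, None)
    loadings = eigvecs[:, order] * np.sqrt(lam)  # shape (p, k)

    # Estimate the common factors by least squares on the smallest |z|,
    # which are assumed to be mostly true nulls.
    keep = np.argsort(np.abs(z))[: int(0.9 * p)]
    w_hat, *_ = np.linalg.lstsq(loadings[keep], z[keep], rcond=None)

    # Residual standard deviation after removing the factor part.
    resid_var = np.clip(1.0 - np.sum(loadings ** 2, axis=1), 1e-8, None)
    a = 1.0 / np.sqrt(resid_var)

    # Approximate number of false rejections given the estimated factors,
    # summed over all hypotheses because the true nulls are unknown.
    z_half = norm.ppf(t / 2.0)
    eta = loadings @ w_hat
    expected_false = np.sum(
        norm.cdf(a * (z_half + eta)) + norm.cdf(a * (z_half - eta))
    )

    # Number of rejections at the common threshold.
    pvals = 2.0 * norm.sf(np.abs(z))
    n_reject = int(np.sum(pvals <= t))
    if n_reject == 0:
        return 0.0
    return float(min(1.0, expected_false / n_reject))
```

Calling `pfa_fdp_estimate(z, sigma, t=0.01, k=1)` on equicorrelated statistics returns a value in [0, 1]; when nothing is rejected the function returns 0 by convention.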

Similar resources

Estimating the Proportion of Nonzero Normal Means under Certain Strong Covariance Dependence

The proportion of certain type of hypotheses is a key component of adaptive false discovery procedures in multiple testing. To date, a good estimator of the proportion of false null hypotheses under dependence is lacking. For multiple testing normal means, we develop a (uniformly) consistent estimator of the proportion of nonzero normal means when the dependent test statistics follow a joint no...

False Discovery Control Under Arbitrary Dependence

Multiple hypothesis testing is a fundamental problem in high dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of hypotheses are tested simultaneously to find if any genes are associated with some traits; in finance, thousands of tests are performed to see which fund managers have winning ability. In practice, these te...

False discovery control for multiple tests of association under general dependence

We propose a confidence envelope for false discovery control when testing multiple hypotheses of association simultaneously. The method is valid under arbitrary and unknown dependence between the test statistics and allows for an exploratory approach when choosing suitable rejection regions while still retaining strong control over the proportion of false discoveries.

False discovery rate for scanning statistics

The false discovery rate is a criterion for controlling Type I error in simultaneous testing of multiple hypotheses. For scanning statistics, due to local dependence, clusters of neighbouring hypotheses are likely to be rejected together. In such situations, it is more intuitive and informative to group neighbouring rejections together and count them as a single discovery, with the false discov...

Journal:
  • Journal of the American Statistical Association

Volume 107, Issue 499

Pages: -

Publication date: 2012